Natural Compression for Distributed Deep Learning
Modern deep learning models are often trained in parallel over a collection
of distributed machines to reduce training time. In such settings,
communication of model updates among machines becomes a significant performance
bottleneck and various lossy update compression techniques have been proposed
to alleviate this problem. In this work, we introduce a new, simple yet
theoretically and practically effective compression technique: natural
compression (NC). Our technique is applied individually to all entries of the
to-be-compressed update vector and works by randomized rounding to the nearest
(negative or positive) power of two, which can be computed in a "natural" way
by ignoring the mantissa. We show that compared to no compression, NC increases
the second moment of the compressed vector by no more than the tiny factor
9/8, which means that the effect of NC on the convergence speed
of popular training algorithms, such as distributed SGD, is negligible.
However, the communication savings enabled by NC are substantial, leading to
an improvement in overall theoretical running time. For
applications requiring more aggressive compression, we generalize NC to
natural dithering, which we prove is exponentially better than the
common random dithering technique. Our compression operators can be used on
their own or in combination with existing operators for a more aggressive
combined effect, and offer a new state of the art in both theory and practice.
Comment: 8 pages, 20 pages of Appendix, 6 Tables, 14 Figures
LASR-Guided Stellar Photometric Variability Subtraction: The Linear Algorithm For Significance Reduction
We develop a technique for removing stellar variability in the light curves
of δ Scuti and similar stars. Our technique, which we name the Linear
Algorithm for Significance Reduction (LASR), subtracts oscillations from a time
series by minimizing their statistical significance in frequency space. We
demonstrate that LASR can subtract variable signals of near-arbitrary
complexity and can robustly handle close frequency pairs and overtone
frequencies. We demonstrate that our algorithm achieves a fit equivalent to
prewhitening for the straightforward variable signal of KIC 9700322. We also
show that LASR provides a better fit to seismic activity than prewhitening in
the case of the complex δ Scuti KOI-976.
Comment: 9 pages, 5 figures, accepted for publication in Astronomy &
Astrophysics. Pseudocode and a GitHub link to the code are included in the manuscript
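Subtracting an oscillation by minimizing its signature in frequency space can be illustrated with a minimal stdlib sketch. This is not the LASR code itself: for an evenly sampled series, the least-squares cosine/sine fit below has a closed form and drives the periodogram power at the target frequency toward zero, which is the quantity LASR-style significance reduction minimizes.

```python
import math

def subtract_sinusoid(times, values, freq):
    """Remove the best-fit sinusoid at frequency `freq` (cycles per unit
    time) from an evenly sampled time series via linear least squares,
    zeroing out the power-spectrum peak at that frequency."""
    n = len(values)
    c = [math.cos(2 * math.pi * freq * t) for t in times]
    s = [math.sin(2 * math.pi * freq * t) for t in times]
    # Closed-form least-squares amplitudes for the cos/sin pair
    # (valid for even sampling away from zero and Nyquist frequencies).
    a = 2.0 / n * sum(v * ci for v, ci in zip(values, c))
    b = 2.0 / n * sum(v * si for v, si in zip(values, s))
    return [v - a * ci - b * si for v, ci, si in zip(values, c, s)]
```

Handling the close frequency pairs and overtones mentioned in the abstract would require fitting several such components jointly rather than one at a time.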
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Deep Neural Networks (DNNs) have been a large driver and enabler for AI
breakthroughs in recent years. These models have been getting larger in their
attempt to become more accurate and tackle new upcoming use-cases, including
AR/VR and intelligent assistants. However, training such large
models is costly and time-consuming, and typically yields a single
model to fit all targets. To mitigate this, various techniques have been
proposed in the literature, including pruning, sparsification, and quantization
of the model weights and updates. While able to achieve high compression rates,
these often incur computational overheads or accuracy penalties. Alternatively,
factorization methods have been leveraged to incorporate low-rank compression
into the training process. However, such techniques (e.g., SVD) frequently rely
on the computationally expensive decomposition of layers and are potentially
sub-optimal for non-linear models, such as DNNs. In this work, we take a
further step in designing efficient low-rank models and propose Maestro, a
framework for trainable low-rank layers. Instead of regularly applying a priori
decompositions such as SVD, the low-rank structure is built into the training
process through a generalized variant of Ordered Dropout. This method imposes
an importance ordering via sampling on the decomposed DNN structure. Our
theoretical analysis demonstrates that our method recovers the SVD
decomposition of linear mapping on uniformly distributed data and PCA for
linear autoencoders. We further apply our technique on DNNs and empirically
illustrate that Maestro enables the extraction of lower-footprint models that
preserve model performance while allowing for a graceful accuracy-latency
tradeoff for deployment to devices of different capabilities.
Comment: Under review
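The ordered-dropout idea on a factorized layer can be sketched as follows. This is a toy reconstruction from the abstract, not Maestro's implementation: a layer W is kept in factored form W = U V, and each forward pass samples how many rank-one components to use, so that earlier components are trained more often and accumulate more importance.

```python
import random

def low_rank_apply(U, V, x, r):
    """Apply the rank-r truncation of W = U @ V to vector x, using only
    the first r columns of U and the first r rows of V (the 'ordered'
    structure: leading rank components carry the most signal)."""
    # h = V[:r] @ x
    h = [sum(V[k][j] * x[j] for j in range(len(x))) for k in range(r)]
    # y = U[:, :r] @ h
    return [sum(U[i][k] * h[k] for k in range(r)) for i in range(len(U))]

def ordered_dropout_forward(U, V, x, max_rank):
    """One stochastic forward pass: sample a rank uniformly from
    {1, ..., max_rank} and use only that many rank-one components."""
    r = random.randint(1, max_rank)
    return low_rank_apply(U, V, x, r)
```

At deployment time, a smaller-footprint model is extracted by simply fixing r below max_rank, trading accuracy for latency as the abstract describes.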
Improving Performance of Private Federated Models in Medical Image Analysis
Federated learning (FL) is a distributed machine learning (ML) approach that
allows data to be trained without being centralized. This approach is
particularly beneficial for medical applications because it addresses some key
challenges associated with medical data, such as privacy, security, and data
ownership. On top of that, FL can improve the quality of ML models used in
medical applications. Medical data is often diverse and can vary significantly
depending on the patient population, making it challenging to develop ML models
that are accurate and generalizable. FL allows medical data to be used from
multiple sources, which can help to improve the quality and generalizability of
ML models. Differential privacy (DP) is a go-to algorithmic tool to make this
process secure and private. In this work, we show that the model performance
can be further improved by employing local steps, a popular approach to
improving the communication efficiency of FL, and tuning the number of
communication rounds. Concretely, given the privacy budget, we derive the
optimal number of local steps and communication rounds. We provide theoretical
motivation, further corroborated by experimental evaluations on real-world
medical imaging tasks.
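The interaction between local steps and privacy noise can be sketched with a toy differentially private FedAvg round. This is an illustrative sketch, not the paper's algorithm: each client runs several local SGD steps on a one-dimensional squared loss, the update is clipped for bounded sensitivity, and Gaussian noise calibrated to the clipping bound is added to the average.

```python
import random

def dp_fedavg_round(global_w, client_data, local_steps, lr, clip, noise_std):
    """One DP-FedAvg round on a toy 1-D problem: each client minimizes
    (w - x)^2 / 2 for `local_steps` SGD steps; updates are clipped and
    the average is perturbed with Gaussian noise."""
    updates = []
    for x in client_data:
        w = global_w
        for _ in range(local_steps):
            w -= lr * (w - x)                 # gradient of (w - x)^2 / 2
        delta = w - global_w
        delta = max(-clip, min(clip, delta))  # clip for bounded sensitivity
        updates.append(delta)
    avg = sum(updates) / len(updates)
    # Noise scale is tied to the clipping bound and shrinks with averaging.
    avg += random.gauss(0.0, noise_std * clip / len(client_data))
    return global_w + avg
```

More local steps let each round make more progress, but a fixed privacy budget limits the number of rounds that can be noised, which is the tradeoff the abstract optimizes.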
Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition
Federated Learning (FL) is a promising research paradigm that enables the
collaborative training of machine learning models among various parties without
the need for sensitive information exchange. Nonetheless, retaining data in
individual clients introduces fundamental challenges to achieving performance
on par with centrally trained models. Our study provides an extensive review of
federated learning applied to visual recognition. It underscores the critical
role of thoughtful architectural design choices in achieving optimal
performance, a factor often neglected in the FL literature. Many existing FL
solutions are tested on shallow or simple networks, which may not accurately
reflect real-world applications. This practice restricts the transferability of
research findings to large-scale visual recognition models. Through an in-depth
analysis of diverse cutting-edge architectures such as convolutional neural
networks, transformers, and MLP-mixers, we experimentally demonstrate that
architectural choices can substantially enhance FL systems' performance,
particularly when handling heterogeneous data. We study 19 visual recognition
models from five different architectural families on four challenging FL
datasets. We also re-investigate the inferior performance of convolution-based
architectures in the FL setting and analyze the influence of normalization
layers on the FL performance. Our findings emphasize the importance of
architectural design for computer vision tasks in practical scenarios,
effectively narrowing the performance gap between federated and centralized
learning. Our source code is available at
https://github.com/sarapieri/fed_het.git.
Comment: to be published in NeurIPS 202
Transcranial direct current stimulation (tDCS) modulation of picture naming and word reading:A meta-analysis of single session tDCS applied to healthy participants
Recent reviews quantifying the effects of single sessions of transcranial direct current stimulation (or tDCS) in healthy volunteers find only minor effects on cognition despite the popularity of this technique. Here, we wanted to quantify the effects of tDCS on language production tasks that measure word reading and picture naming. We reviewed 14 papers measuring tDCS effects across a total of 96 conditions to a) quantify effects of conventional stimulation on language regions (i.e., left hemisphere anodal tDCS administered to temporal/frontal areas) under normal conditions or under conditions of cognitive (semantic) interference; and b) identify parameters which may moderate the size of the tDCS effect within conventional stimulation protocols (e.g., online vs. offline, high vs. low current densities, and short vs. long durations), as well as within types of stimulation not typically explored by previous reviews (i.e., right hemisphere anodal tDCS or left/right hemisphere cathodal tDCS). In all analyses there was no significant overall effect of tDCS, but we did find a small yet significant effect of the timing and duration of stimulation, with stronger effects for offline stimulation and for shorter durations (< 15 min). We also found some indication of publication bias towards reporting positive effects. We encourage further experimentation in order to resolve the disparity between the current popularity of tDCS and its poor efficacy in healthy participants.
Modern Electronic Techniques Applied to Physics and Engineering
Contains reports on three research projects.
Methodology for the nocturnal cardiac arrhythmia ancillary study of the ADVENT-HF trial in patients with heart failure with reduced ejection fraction and sleep-disordered breathing
Background
Sleep disordered breathing (SDB) may trigger nocturnal cardiac arrhythmias (NCA) in patients with heart failure with reduced ejection fraction (HFrEF). The NCA ancillary study of the ADVENT-HF trial will test whether, in HFrEF-patients with SDB, peak-flow-triggered adaptive servo-ventilation (ASVpf) reduces NCA. To this end, accurate scoring of NCA from polysomnography (PSG) is required.
Objective
To develop a method to detect NCA accurately from a single-lead electrocardiogram (ECG) recorded during PSG and assess inter-observer agreement for NCA detection.
Methods
Quality assurance of ECG analysis included training of the investigators, development of standardized technical quality, guideline-conforming semi-automated NCA scoring via Holter-ECG software, and implementation of an arrhythmia adjudication committee. To assess inter-observer agreement, the ECG was analysed by two independent investigators and compared for agreement on premature ventricular complexes per hour (PVC/h) and premature atrial complexes per hour (PAC/h), as well as for other NCA, in 62 patients from two centers of the ADVENT-HF trial.
Results
The intraclass correlation coefficients for PVC/h and PAC/h were excellent: 0.99 (95% confidence interval [CI]: 0.99–0.99) and 0.99 (95% CI: 0.97–0.99), respectively. No clinically relevant difference in inter-observer classification of other NCA was found. The detection of non-sustained ventricular tachycardia (18% versus 19%) and atrial fibrillation (10% versus 11%) was similar between the two investigators. No sustained ventricular tachycardia was detected.
Conclusion
These findings indicate that our methods are highly reliable for scoring NCAs and are adequate to apply to the entire PSG data set of the ADVENT-HF trial.
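The inter-observer agreement statistic reported above can be computed from the two investigators' counts. The abstract does not state which ICC form was used; the sketch below implements one common choice, ICC(2,1) (two-way random effects, single rater, absolute agreement), from the standard ANOVA mean squares.

```python
def icc2_1(ratings):
    """ICC(2,1): two-way random-effects, single-rater, absolute-agreement
    intraclass correlation. `ratings` is a list of rows, one per subject,
    each row holding that subject's score from every rater."""
    n = len(ratings)     # subjects (here: patients)
    k = len(ratings[0])  # raters (here: the two investigators)
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Perfectly agreeing raters give an ICC of 1; values near 0.99, as in the study, indicate near-identical PVC/h and PAC/h counts across the two investigators.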